Mapping with polygons files

There are many approaches to mapping in R. The simplest is just using polygons whose shapes are defined in a data frame. Here is an example with election data which we will look more closely at in a few weeks.

First we load the map data. Note that is a collection of names (MA cities), with coordinates for the corners of each named polygon. It’s non-trivial to make these things, so best is just to find them pre-constructed.

setwd("~/Desktop/Dropbox/bostonography_S2020/readings, lectures, code, data/Week 2/Day 2")
map <- read.csv("map.csv",stringsAsFactors=F)
head(map)
##     place      lat     long group
## 1 Newbury 951401.6 256497.2   0.1
## 2 Newbury 951284.0 256497.0   0.1
## 3 Newbury 951152.6 256523.5   0.1
## 4 Newbury 950778.6 256534.1   0.1
## 5 Newbury 950699.9 256547.5   0.1
## 6 Newbury 950562.9 256548.6   0.1

Next we load in some data about each polygon. In this case, we use election results for the 2012 Presidential contest per MA city.

election <- read.csv("election_candidates.csv",stringsAsFactors=F)
election_12 <- election[election$year==2012 & election$office == "President" & election$candidate=="Obama/ Biden",] 
head(election_12)
##    votes   percent                         key year    office    place
## 1   4144 0.5050579 Obama/ Biden President 2012 2012 President Abington
## 3   7872 0.6701285 Obama/ Biden President 2012 2012 President    Acton
## 5   3046 0.5810759 Obama/ Biden President 2012 2012 President Acushnet
## 7   3013 0.7862735 Obama/ Biden President 2012 2012 President    Adams
## 9   7288 0.5147620 Obama/ Biden President 2012 2012 President   Agawam
## 11   233 0.7845118 Obama/ Biden President 2012 2012 President   Alford
##       candidate
## 1  Obama/ Biden
## 3  Obama/ Biden
## 5  Obama/ Biden
## 7  Obama/ Biden
## 9  Obama/ Biden
## 11 Obama/ Biden

Merge this with the polygon data. (Note that we use the inner_join function instead of merge because inner_join preserves row orders, and ggplot really likes the polygons to remain sorted.)

library(tidyverse)
e_select_mapped <- inner_join(election_12,map)

Now we plot it, using geom_polygon() in ggplot, and we fill the polygons with the Obama percentage on a red-blue spectrum.

library(ggplot2)
ggplot(e_select_mapped) + 
  geom_polygon() + 
  aes(x=long,y=lat,fill=percent,group=group) + 
  scale_fill_gradient2(midpoint=.5,low = "red",high="blue",mid="white")

Hm… Boston is blue, as is the university area in the middle. But why is the rural west blue? We’ll come back to this…

Shape files

Often spatial data that you find on the internet is not specified as tidy polygons, but in “shape” files. Let’s work with a world map shape file found on the internet, as described here: https://www.r-graph-gallery.com/168-load-a-shape-file-into-r.html Note that if the approach in the code block below doesn’t work on your machine, you can always go the URL manually, download the file, and unzip it into your working directory.

# Acquire shape file and unzip it
#download.file("http://thematicmapping.org/downloads/TM_WORLD_BORDERS_SIMPL-0.3.zip" , destfile="world_shape_file.zip")
#system("unzip world_shape_file.zip")

# Install package to handle shape files, and then load it
# install.packages("rgdal")
library(rgdal)
world_spdf <- readOGR( 
  dsn= getwd(), 
  layer="TM_WORLD_BORDERS_SIMPL-0.3",
  verbose=FALSE
)
# We use the "tidy" function in "broom" to convert it to a data frame for ggplot
#install.packages("broom")
library(broom)
world_asdf <- tidy(world_spdf, region = "NAME")

First the basic plot:

ggplot() +
  geom_polygon(data = world_asdf, aes( x = long, y = lat, group = group), fill="#69b3a2", color="white") +
  theme_void() 

Now let’s make a choropleth, or colored map, colored by population per country.

# First clean up our variable of interest, population. We log it to make the colors look less imbalanced, otherwise India and China would be the only dark ones.
# Note that world_spdf is a complex data object where "data" is inside "my_spdf" and referenced by "@", while "data" itself is a data frame, and thus to get the population variable we reference it with "$". So the whole thing is my_spdf@data$POP2005
world_spdf@data$POP2005[ which(world_spdf@data$POP2005 == 0)] <- NA
world_spdf@data$POP2005LG <- log(
  as.numeric(
    as.character(
      world_spdf@data$POP2005)) + 10) 
world_asdf <- tidy(world_spdf, region = "NAME")
world_asdf2 <- merge(world_asdf,world_spdf@data,by.x="id",by.y="NAME")

ggplot(world_asdf2) + 
  geom_polygon() + 
  aes(x=long,y=lat,fill=POP2005LG,group=group) + 
  scale_fill_gradient2(midpoint=.5,low = "red",high="blue",mid="white") +
  labs(fill = "Log pop") +
  theme_void() 

Now let’s do it for Boston. The data are available here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SQ6BT4 as Tracts_Boston_2010_BARI.zip; unzip all the contained files into your wd first.

library(rgdal)
bari_spdf <- readOGR( 
  dsn= getwd(), 
  layer="Tracts_Boston BARI",
  verbose=FALSE
)

# Convert to a dataframe and add back the data as before
library(broom)
bari_asdf <- tidy(bari_spdf, region = "GEOID10")
bari_asdf2 <- merge(bari_asdf,bari_spdf@data,by.x="id",by.y="GEOID10")

ggplot(bari_asdf2) + 
  geom_polygon() + 
  aes(x=long,y=lat,fill=POP100,group=group) + 
  scale_fill_gradient2(midpoint=.5,low = "red",high="blue",mid="white") +
  labs(fill = "Population") +
  theme_void() 

Leaflet package

The leaflet package is designed for HTML and loads map data into an interactive window. It can be used with local shape files and internet map data that is dynamically loaded.

For example, here is the basic package dynamically loading its default, OpenStreetMaps data. (Note that we are now using now the piping function from the “tidyverse” package installed above. This just sends the output of the previous function into the next fuction as its first argument.)

# install.packages("leaflet")
library(leaflet)

# Initialize the leaflet map with the leaflet() function
leaflet() %>% 
  addTiles() %>% 
  setView( lng = -71.06, lat = 42.36, zoom = 10 )

To change map layers, see here: http://leaflet-extras.github.io/leaflet-providers/preview/index.html . For example, here is Google satellite:

leaflet() %>% 
  addTiles() %>% 
  setView( lng = -71.06, lat = 42.36, zoom = 10 ) %>%
   addProviderTiles("Esri.WorldImagery")

Now let’s take our local shapefile data and add that data. This is following the example here: https://www.r-graph-gallery.com/183-choropleth-map-with-leaflet.html . Note that we are adding our local polygons, colored with our local data, on top of the leaflet map info that is dynamically loaded; if you zoom in, you can see the OpenStreetMaps data underneath.

leaflet(world_spdf)%>% addTiles()  %>% setView( lat=10, lng=0 , zoom=2) %>%
  addPolygons( stroke = FALSE, fillOpacity = 0.5, smoothFactor = 0.5, color = ~colorNumeric("YlOrRd", POP2005LG)(POP2005LG) )

Adding point data to maps

Let’s use some point data drawn from Twitter, the geolocations of various tweets. We’ll return to this data later.

Using leaflet:

load(file="boston1k.Rdata") # Tweet locaiton data; dataframe as bostwi

leaflet(bostwi) %>% 
  addTiles() %>% 
  setView( lng = -71.06, lat = 42.36, zoom = 14 ) %>%
  addCircles(~as.numeric(longitude), ~as.numeric(latitude), weight = 3, radius=20, color="blue", stroke = TRUE, fillOpacity = 0.5) 

And now using another package, ggmap, which like leaflet loads google maps, but this time within the ggplot framework, so that we can add other layers on it easily.

#install.packages("ggmap")
library(ggmap)

# map it:
boston_mp <- get_map(location = c(left = -71.15, bottom = 42.3, right = -70.985746, top = 42.4))

g1 <- ggmap(boston_mp) + geom_point(data=bostwi, aes(x=as.numeric(longitude), y=as.numeric(latitude)) )
g1

Eg, here it is using ggplot’s contour layering function:

g1 + stat_density2d(
    aes(x = as.numeric(longitude), y = as.numeric(latitude), fill = ..level.., alpha = 0.05),
    size = 0.01, bins = 30, data = bostwi,
    geom = "polygon"
  )

Cartograms

Just for fun, here’s an example of a cartogram as we discussed in class, a distorted map that rescales regions by some feature, such as their population.

# Install the cartogram package.  They broke the most recent one, so we need the previous version.
# install.packages("devtools")
# library(devtools)
# install_version("cartogram", version = "0.1.1", repos = "http://cran.us.r-project.org")
library(cartogram)
# install.packages("maptools") # to get some data
library(maptools)
data(wrld_simpl)
# Pick out Africa
afr=wrld_simpl[wrld_simpl$REGION==2,]
# Plot as is
plot(afr)

# Cartogram
afr_cartogram <- cartogram(afr, "POP2005", itermax=5)
plot(afr_cartogram)

More useful sites

Here are a few useful websites with examples and tutorials for mapping in R:

https://www.littlemissdata.com/blog/maps

https://rstudio.github.io/leaflet/

https://trendct.org/2015/06/26/tutorial-how-to-put-dots-on-a-leaflet-map-with-r/

https://bookdown.org/robinlovelace/geocompr/adv-map.html

https://socviz.co/maps.html

Especially useful is: https://www.r-graph-gallery.com/map.html This also has many other useful tutorials for all kinds of graphs in R.